fix: raise ValueError for unsupported MIME types in file_data URI path #4

Open
Raman369AI wants to merge 28 commits into main from fix/file-uri-unknown-mime-type-error
@Raman369AI
Owner

Relates to google#5022 and upstream PR google#5023

Problem

When a Part with file_data.file_uri has no determinable MIME type, the library fell back to application/octet-stream via _DEFAULT_MIME_TYPE. This value propagated to LiteLLM which raised a cryptic internal ValueError with no guidance for the user.

The same failure occurred when the caller explicitly set mime_type = "application/octet-stream" on the Part. Both cases reach the same failure point.

There was also an inconsistency between the two content paths:

  • The inline_data path raises ValueError immediately for unsupported MIME types
  • The file_data path silently used a fallback and failed later with a cryptic message

GcsArtifactService generates URIs like gs://bucket/artifact/0 with no extension and no MIME type, making ADK's own artifact system the primary trigger for this fallback.
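
To illustrate why an extension-less artifact URI leaves nothing to infer a type from, here is a small sketch using Python's standard `mimetypes` module. Whether ADK relies on `mimetypes` for this inference is an assumption, and the `report.pdf` URI is made up for contrast.

```python
import mimetypes

# URI shape described above for GcsArtifactService: no file extension.
artifact_uri = "gs://bucket/artifact/0"
# Hypothetical URI with an extension, for contrast.
pdf_uri = "gs://bucket/report.pdf"

artifact_type, _ = mimetypes.guess_type(artifact_uri)
pdf_type, _ = mimetypes.guess_type(pdf_uri)

print(artifact_type)  # None: no extension to map to a type
print(pdf_type)       # application/pdf
```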

Fix

Removes _DEFAULT_MIME_TYPE and raises ValueError early with an actionable message when the resolved MIME type is either unknown or application/octet-stream. This aligns the file_data path with the existing fail-fast behavior of the inline_data path.

The logic order is also corrected so providers that always produce a text fallback (anthropic, non-Gemini Vertex AI) and OpenAI/Azure HTTP media URLs are handled before the MIME type guard, keeping those paths unaffected.
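
The guard described above could look roughly like the following. This is a minimal sketch, not the code in `lite_llm.py`: the helper name `resolve_file_uri_mime_type`, its signature, the resolution order, and the error text are all assumptions made for illustration.

```python
import mimetypes
from typing import Optional

_GENERIC_MIME_TYPE = "application/octet-stream"


def resolve_file_uri_mime_type(
    file_uri: str,
    mime_type: Optional[str] = None,
    display_name: Optional[str] = None,
) -> str:
  """Returns a usable MIME type for a file URI or raises ValueError.

  Hypothetical helper: name, signature, and lookup order are illustrative.
  """
  resolved = mime_type
  if not resolved and display_name:
    resolved, _ = mimetypes.guess_type(display_name)
  if not resolved:
    resolved, _ = mimetypes.guess_type(file_uri)

  # Fail fast for both cases: unknown type and the generic octet-stream type.
  if not resolved or resolved == _GENERIC_MIME_TYPE:
    raise ValueError(
        f"Cannot determine a supported MIME type for file_uri {file_uri!r}. "
        "Set file_data.mime_type explicitly, for example 'application/pdf'."
    )
  return resolved
```

With this shape, `resolve_file_uri_mime_type("gs://bucket/artifact/0")` raises immediately, and so does a call that passes `mime_type="application/octet-stream"`, which mirrors the two cases the description names.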

Changes

  • src/google/adk/models/lite_llm.py: remove _DEFAULT_MIME_TYPE, restructure file_uri handling block, raise ValueError for missing or generic MIME types
  • tests/unittests/models/test_litellm.py: update two existing tests to assert the new ValueError, add one new test covering explicit application/octet-stream

Testing

pytest tests/unittests/models/test_litellm.py
241 passed, 5 errors (pre-existing, missing pytest-mock fixture)

Format verified with pyink. mypy comm -13 simulation: zero new errors.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the ADK to version 1.28.0, introducing significant security, concurrency, and stability enhancements. Key improvements include origin-check middleware for HTTP and WebSockets, strict agent name validation to prevent arbitrary imports, and the restriction of builder endpoints to web-enabled environments. The update also resolves concurrency races in session state management, adds regional support for Discovery Engine, ensures correct Pydantic required field parsing, provides consistent function call ID generation, and pins the LiteLLM dependency to a secure version. Feedback was provided regarding the risk of silent data loss when configuring models to ignore extra fields, suggesting a more explicit approach to unknown data.


  model_config = ConfigDict(
-     extra='forbid',
+     extra='ignore',
medium

Changing extra='forbid' to extra='ignore' in model_config can lead to silent data loss if external data sources introduce new fields that are not explicitly handled by the model. While it can improve compatibility, it's important to be aware of the trade-offs. Consider if extra='allow' or a more explicit handling of unknown fields would be more appropriate depending on the expected data evolution.

DeanChensj and others added 28 commits May 12, 2026 21:34
Merge google#5679

Same config as in adk-python-community

COPYBARA_INTEGRATE_REVIEW=google#5679 from DeanChensj:main d33c368
PiperOrigin-RevId: 914634367
This CL introduces logic to handle scenarios where a non-ADK agent transitions to the `TaskState.input_required` or `TaskState.auth_required` states. It intercepts these events and converts them into a synthetic ADK `FunctionCall` event.

PiperOrigin-RevId: 914877123
Co-authored-by: Kathy Wu <wukathy@google.com>
PiperOrigin-RevId: 914934690
…same invocation

Currently, BaseToolset caches its tools per invocation_id. Because SkillToolset dynamically resolves additional tools from the state when a skill is loaded, the cache prevents new tools from being picked up in the same invocation right after a load_skill call. This change sets `_use_invocation_cache = False` in SkillToolset so that it correctly re-evaluates the state-dependent tools at each step of the LLM generation loop within an invocation, preventing "Tool not found" errors.
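
The effect of the per-invocation cache can be sketched with toy classes. Only the `_use_invocation_cache` attribute name comes from the commit message; the classes, the state layout, and the tool names below are hypothetical stand-ins and do not reflect the real BaseToolset or SkillToolset implementation.

```python
class CachingToolset:
  """Toy stand-in for a toolset that caches tools per invocation_id."""

  _use_invocation_cache = True

  def __init__(self):
    self._cache = {}

  def _resolve_tools(self, state):
    # State-dependent resolution: one extra tool per loaded skill (illustrative).
    return ["load_skill"] + [f"{s}_tool" for s in state.get("skills", [])]

  def get_tools(self, invocation_id, state):
    if not self._use_invocation_cache:
      return self._resolve_tools(state)
    if invocation_id not in self._cache:
      self._cache[invocation_id] = self._resolve_tools(state)
    return self._cache[invocation_id]


class UncachedSkillToolset(CachingToolset):
  """Re-evaluates state on every call, as the change describes."""

  _use_invocation_cache = False


state = {"skills": []}
cached, uncached = CachingToolset(), UncachedSkillToolset()
cached.get_tools("inv-1", state)
uncached.get_tools("inv-1", state)

state["skills"].append("search")  # simulate load_skill mid-invocation

stale = cached.get_tools("inv-1", state)
fresh = uncached.get_tools("inv-1", state)
print(stale)  # ['load_skill']
print(fresh)  # ['load_skill', 'search_tool']
```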

PiperOrigin-RevId: 914997555
…ling

This change removes support for the deprecated `--session_db_url`, `--artifact_storage_uri`, and `--verbosity` flags from the ADK CLI. It also simplifies the service URI handling in `cli_deploy.py` by always using the new `--session_service_uri`, `--artifact_service_uri`, and `--memory_service_uri` flags, regardless of the ADK version.

These flags have been deprecated for more than one year.

Co-authored-by: Shangjie Chen <deanchen@google.com>
PiperOrigin-RevId: 915100881
This is related to google#5327.
`BaseToolset.get_auth_config()` returns `None` by default and removing its overrides in toolsets that don't need OAuth flows to list tools does the job.

No regressions in unit tests (`pytest tests/unittests/auth`):
```
================================================================= 181 passed, 508 warnings in 4.23s =================================================================
```

PiperOrigin-RevId: 915128706
This change enables the Google Cloud Telemetry exporter to use mTLS endpoints. It checks for the availability of client certificates and respects the GOOGLE_API_USE_CLIENT_CERTIFICATE environment variables to determine whether to use the mTLS-specific endpoint and configure the session accordingly.

PiperOrigin-RevId: 915541335
…confusion from BaseToolset's invocation_cache

Co-authored-by: Kathy Wu <wukathy@google.com>
PiperOrigin-RevId: 915578052
This CL implements the  class in the integrations folder, used specifically for the Skill Registry API.

Co-authored-by: Kathy Wu <wukathy@google.com>
PiperOrigin-RevId: 915627057
…unity alignment

- Restrict invoke and review triggers purely to explicit user comments.
- Enforce strict author association verification (OWNER, MEMBER, COLLABORATOR).
- Enforce strict targeting assertion to ensure pull requests act on the main branch.
- Synchronize prompt constraints and GitHub action tools with the community catalog.
- Refine action API key options to uniformly target secrets.GOOGLE_API_KEY.

Co-authored-by: Shangjie Chen <deanchen@google.com>
PiperOrigin-RevId: 915654346
… double-escaping

PiperOrigin-RevId: 916037238
This ensures that adding README.md files to subdirectories (as discussed
for new folders and integrations) won't result in them being included in
the published package.

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916075206
When judge_model_config is None, LlmRequest raises a ValidationError
because it requires a config. We now construct a default GenerateContentConfig
if one isn't provided.

Close google#5677

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916087055
…actions

This change introduces logic to identify events containing requests for tool confirmation or auth credentials. The compaction process will now stop before any such "Human-in-the-Loop" (HITL) events, ensuring that the full context of the interaction is preserved and not summarized away. This applies to both sliding window and token threshold compaction strategies.
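
As a rough illustration of stopping compaction before a Human-in-the-Loop event, the sketch below splits an event list at the first such event. The `Event` dataclass, its fields, and `split_for_compaction` are hypothetical and far simpler than ADK's real event and compaction types.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
  """Toy event; the two flags mark Human-in-the-Loop requests."""

  text: str
  requests_confirmation: bool = False
  requests_auth: bool = False


def is_hitl(event: Event) -> bool:
  return event.requests_confirmation or event.requests_auth


def split_for_compaction(events: List[Event]) -> Tuple[List[Event], List[Event]]:
  """Returns (events safe to summarize, events to keep verbatim)."""
  for i, event in enumerate(events):
    if is_hitl(event):
      # Stop before the first HITL event so its context is preserved.
      return events[:i], events[i:]
  return events, []


history = [
    Event("user asks to delete a file"),
    Event("agent plans the deletion"),
    Event("confirm deletion?", requests_confirmation=True),
    Event("user confirms"),
]
to_summarize, to_keep = split_for_compaction(history)
print(len(to_summarize), len(to_keep))  # 2 2
```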

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916108771
`part_to_message_block` iterated `content` char-by-char when a tool
returned it as a plain string (e.g. `LoadSkillResourceTool`'s
`{"content": <file text>}`), producing `"H\ne\nl\nl\no"` instead of
`"Hello"`. Guard the list branch with `isinstance(..., list)` and add
a sibling branch that passes a scalar string through directly, matching
Anthropic's `content: str | list[ContentBlockParam]` shape.
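
The bug and the guard can be reproduced in isolation. The helper below is a hypothetical simplification, not the actual `part_to_message_block` code; it only shows why a string must be checked before it is treated as a list.

```python
from typing import Any


def tool_content_to_text(content: Any) -> str:
  """Hypothetical helper that flattens a tool result's `content` to text."""
  if isinstance(content, list):
    # List branch: join the items one per line.
    return "\n".join(str(item) for item in content)
  if isinstance(content, str):
    # Scalar string branch: pass the text through unchanged.
    return content
  return str(content)


# Without the isinstance guard, a str is iterated character by character.
buggy = "\n".join(item for item in "Hello")
print(repr(buggy))                    # 'H\ne\nl\nl\no'
print(tool_content_to_text("Hello"))  # Hello
```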

Close google#5358

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916109239
Co-authored-by: Amaad Martin <amaadmartin@google.com>
PiperOrigin-RevId: 916112779
…ack as successful

Previously, an empty `candidates` list without `prompt_feedback` resulted in an `UNKNOWN_ERROR`. This change updates the logic to handle such cases as a successful completion with no generated content, which is valid for certain model interactions like tool-driven turns.

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916115022
…rt ~8%

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916195791
…ntTool

AgentTool.run_async only extracted text parts from the inner agent's
response, silently dropping code_execution_result.output and
executable_code.code. Outer agents using an inner agent with a code
executor saw nothing.

Close google#5481

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916196604
Co-authored-by: Kathy Wu <wukathy@google.com>
PiperOrigin-RevId: 916198410
Adds @functools.lru_cache to find_context_parameter so the inspect.signature
+ typing.get_type_hints lookup runs once per function, not on every MCP
confirmation callback or declaration build. No public surface change.
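
A minimal sketch of the caching pattern follows. The function name `find_context_parameter` comes from the commit message, but its body, return type, and the `ToolContext` placeholder class here are assumptions for illustration only.

```python
import functools
import inspect
import typing
from typing import Callable, Optional


class ToolContext:
  """Placeholder for the context type being searched for (assumption)."""


@functools.lru_cache(maxsize=None)
def find_context_parameter(func: Callable) -> Optional[str]:
  """Returns the name of the first parameter annotated with ToolContext."""
  hints = typing.get_type_hints(func)
  for name in inspect.signature(func).parameters:
    if hints.get(name) is ToolContext:
      return name
  return None


def my_tool(query: str, ctx: ToolContext) -> str:
  return query


first = find_context_parameter(my_tool)   # runs the signature lookup
second = find_context_parameter(my_tool)  # served from the cache
info = find_context_parameter.cache_info()
print(first, info.hits, info.misses)  # ctx 1 1
```

Because `lru_cache` keys on the function object, each distinct tool function pays for the `inspect.signature` and `typing.get_type_hints` lookup once.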

Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 916204929
… MCP session creation failure

Co-authored-by: Sasha Sobran <asobran@google.com>
PiperOrigin-RevId: 916220631
* Implements live inference in evaluation_generator.py using Runner.run_live().
* Updates base_eval_service.py, local_eval_service.py to support live mode data structures and connection handling.

Testing: Added unit tests:
* test_generates_inferences_with_user_simulator_live
* test_live_session_manually_triggers_callbacks
* test_live_session_manually_triggers_callbacks_with_tools
* test_perform_inference_with_use_live
* test_perform_inference_single_eval_item_live
* test_perform_inference_single_eval_item_non_live
PiperOrigin-RevId: 916231029
The file_data.file_uri path silently fell back to application/octet-stream
when no MIME type could be determined, then passed it to LiteLLM which raised
a cryptic internal ValueError. The inline_data path already had fail-fast
behavior for unsupported types but the file_data path did not.

This change removes the _DEFAULT_MIME_TYPE fallback and raises ValueError early
with an actionable message for two cases: when no MIME type can be determined
from the URI, display_name, or explicit field, and when the resolved type is
application/octet-stream regardless of whether it was set by the caller or
arrived via a library default. Both cases cause the same downstream failure.

The logic order is also corrected so that providers which always produce a
text fallback (anthropic, non-Gemini Vertex AI) and OpenAI/Azure HTTP media
URLs are handled before the MIME type guard, keeping those paths unaffected.

Tests are updated to assert the new ValueError and a new test covers the
explicit application/octet-stream case.
Adds test_content_to_message_param_user_message_file_uri_explicit_octet_stream
to confirm that an upstream caller passing mime_type='application/octet-stream'
raises a clear ValueError, covering both branches of the combined guard.

Fixes: google#5022
@Raman369AI Raman369AI force-pushed the fix/file-uri-unknown-mime-type-error branch from d943bdd to c7eb917 Compare May 15, 2026 23:51
6 participants